
IN THE CLAIMS 

Please amend the claims as follows: 

1. (Currently Amended) A method for performing a gather operation on a
computer processor comprising: 

computing addresses for a plurality of data elements of a matrix stored in 
memory, [[utilizing]] wherein each data element is identified by one of a plurality of
indices and a base address, and wherein computing addresses comprises
executing a first plurality of instructions to transfer a plurality of said indices from
a first storage location where the indices are stored substantially contiguously, to
an equal plurality of separate storage locations, wherein each index is assigned
its own separate storage location;

retrieving each of said data elements from memory based on the 
computed addresses; and 

executing a second plurality of instructions, each instruction depositing 
one or more of said data elements contiguously with other data elements in a 
second storage location. 
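
For illustration only, the three steps recited in claim 1 can be sketched in portable C. This is a minimal sketch, not the claimed instruction sequence: it assumes 64-bit storage locations holding four 16-bit fields (the sizes recited in claim 7), and the names `extract16`, `deposit16`, and `gather4` are hypothetical helpers that emulate in software what the claims describe as processor instructions.

```c
#include <stdint.h>

/* Hypothetical helper: return the 16-bit field at position `lane`
   (0 = least significant) of a 64-bit word. */
static uint64_t extract16(uint64_t packed, unsigned lane) {
    return (packed >> (16u * lane)) & 0xFFFFu;
}

/* Hypothetical helper: write the low 16 bits of `value` into field
   `lane` of `dest`, leaving the other three fields unchanged. */
static uint64_t deposit16(uint64_t dest, unsigned lane, uint64_t value) {
    uint64_t mask = (uint64_t)0xFFFFu << (16u * lane);
    return (dest & ~mask) | ((value & 0xFFFFu) << (16u * lane));
}

/* Gather four 16-bit elements of `base` selected by four indices that
   are packed contiguously in `packed_indices` (an assumed layout). */
uint64_t gather4(const uint16_t *base, uint64_t packed_indices) {
    uint64_t idx[4];   /* one separate location per index */
    uint16_t elem[4];  /* one separate location per loaded element */
    uint64_t result = 0;
    unsigned lane;

    /* Step 1: move each index out of the packed word into its own location. */
    for (lane = 0; lane < 4; ++lane)
        idx[lane] = extract16(packed_indices, lane);

    /* Step 2: compute each address as base plus index and load the element.
       C pointer arithmetic scales the index by the element size. */
    for (lane = 0; lane < 4; ++lane)
        elem[lane] = *(base + idx[lane]);

    /* Step 3: deposit each element next to the others in one result word. */
    for (lane = 0; lane < 4; ++lane)
        result = deposit16(result, lane, elem[lane]);

    return result;
}
```

Calling `gather4` with indices 2, 5, 0 and 7 packed from the least significant field upward returns a word whose four fields hold the elements at those positions, in the same order.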

2. (Original) The method as in claim 1 wherein said storage locations are 
registers. 

3. (Currently Amended) The method as in claim 1 wherein computing 
addresses further comprises: 

[[extracting indices for each of said data elements into separate storage
locations; and]]

adding each of said indices to a base address.

4. (Currently Amended) The method as in claim 1 further comprising: 
loading each of said data elements from memory into separate storage
locations prior to executing said second plurality of instructions.

5. (Currently Amended) The method as in claim 1 wherein said computer 
processor executes two or more of said first and/or second plurality of 
instructions in a single clock cycle. 

6. (Original) The method as in claim 1 further comprising: 
storing each of said data elements on a mass storage device. 

7. (Original) The method as in claim 2 wherein said registers are 64-bits 
wide and said data elements are 16-bits in length. 

8. (Currently Amended) A method for performing a scatter operation on a 
computer processor comprising: 

calculating addresses in memory to which a plurality of data elements are 
to be scattered to form a matrix in memory [[utilizing]] wherein each address in
memory is identified by one of a plurality of indices and a base address; 

executing a plurality of extract instructions, each of said extract 
instructions extracting one or more of said data elements from a storage location 
in which said data elements are stored contiguously to an equal plurality of 
separate storage locations; and

[[storing]] transferring said data elements from said separate storage
locations to said calculated addresses in memory.
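
For illustration only, the scatter steps recited in claim 8 can be sketched in portable C. This is a minimal sketch under stated assumptions, not the claimed instruction sequence: it assumes 64-bit storage locations holding four 16-bit fields (the sizes recited in claim 13) and indices packed the same way as the data (an assumption in the spirit of claim 10). The names `extract16` and `scatter4` are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: return the 16-bit field at position `lane`
   (0 = least significant) of a 64-bit word. */
static uint64_t extract16(uint64_t packed, unsigned lane) {
    return (packed >> (16u * lane)) & 0xFFFFu;
}

/* Scatter the four 16-bit elements packed in `packed_data` to the
   positions of `base` selected by the four indices in `packed_indices`. */
void scatter4(uint16_t *base, uint64_t packed_indices, uint64_t packed_data) {
    uint64_t idx[4];   /* one separate location per index */
    uint64_t elem[4];  /* one separate location per extracted element */
    unsigned lane;

    /* Extract each index and each data element from its packed word
       into its own separate location. */
    for (lane = 0; lane < 4; ++lane) {
        idx[lane]  = extract16(packed_indices, lane);
        elem[lane] = extract16(packed_data, lane);
    }

    /* Compute each address as base plus index and transfer the element
       there. C pointer arithmetic scales the index by the element size. */
    for (lane = 0; lane < 4; ++lane)
        *(base + idx[lane]) = (uint16_t)elem[lane];
}
```

If two indices are equal, the element in the higher-numbered field is written last and overwrites the earlier one. That ordering is a property of this sketch only; the claims do not address it.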

9. (Currently Amended) The method as in claim 8 wherein each of said
storage locations is a register.

10. (Previously Presented) The method as in claim 8 wherein calculating 
addresses comprises: 

extracting indices for each of said data elements into separate storage 
locations; and 

adding each of said indices to a base address. 

11. (Previously Presented) The method as in claim 8 wherein storing each
of said data elements is accomplished via a plurality of STORE instructions 
executed by said computer processor. 

12. (Previously Presented) The method as in claim 8 wherein said 
computer processor executes two or more of said instructions in a single clock 
cycle. 

13. (Original) The method as in claim 9 wherein said register is 64-bits 
wide and said data elements are 16-bits in length. 

14. (Currently Amended) A computer system comprising: 
a memory; 

a processor communicatively coupled to the memory; and 

a storage device communicatively coupled to the processor and having
stored therein a sequence of instructions which, when executed by the 
processor, causes the processor to at least, 

compute addresses for a plurality of data elements of a matrix stored in 
memory, [[utilizing]] wherein each data element is identified by one of a plurality of
indices and a base address, and wherein computing addresses comprises
executing a first plurality of instructions to transfer a plurality of said indices from
a first storage location where the indices are stored substantially contiguously, to
an equal plurality of separate storage locations, wherein each index is assigned
its own separate storage location;

retrieve each of said data elements from memory based on the computed 
addresses; and 

execute a second plurality of instructions, each instruction to deposit one 
or more of said data elements contiguously with other data elements in a second 
storage location. 

15. (Original) The computer system as in claim 14 wherein said storage 
locations are registers. 

16. (Currently Amended) The computer system as in claim 14 wherein, 
responsive to one or more instructions in said sequence, said processor 
computes addresses by: 

[[extracting indices for each of said data elements into separate storage
locations; and]]

adding each of said indices to a base address. 

17. (Currently Amended) The computer system as in claim 14 wherein
said processor loads each of said data elements from memory into separate 
storage locations prior to executing said second plurality of [[DEPOSIT]]
instructions. 

18. (Currently Amended) The computer system as in claim 17 wherein 
said processor executes two or more of said first and/or second plurality of 
instructions in a single clock cycle. 

19. (Original) The computer system as in claim 14 wherein, responsive to
one or more instructions in said sequence, said processor further:

stores each of said data elements on said mass storage device.

20. (Original) The computer system as in claim 15 wherein said registers are
64-bits wide and said data elements are 16-bits in length.

21. (Previously Presented) A method as in claim 1 wherein computing
addresses comprises:

executing a series of instructions, each instruction to extract an address index
for one of said plurality of data elements.

22. (Original) The method as in claim 21 wherein said address indices are
extracted from a series of contiguous memory locations.

23. (Cancelled)